Naïve Bayes Classification Ensembles to Support Modeling Decisions in Data Stream Mining
Author
Abstract
Data stream mining is the process of applying data mining methods to a data stream in real time in order to create descriptive or predictive models. Because data streams are dynamic, new classes may emerge as a stream evolves, and the concept being modeled may change over time. The predictive model must therefore be revised continuously. Revising the model requires labeled training data, but manual labeling may not keep pace with the speed at which data needs to be labeled. This paper proposes a predictive modeling framework that supports two of the common decisions in stream mining: (1) determining when model revision should be performed and (2) deciding which newly arrived instances should be used as training data. The framework consists of an online component and an offline component. The online component uses an ensemble of Naïve Bayes base models to make predictions for newly arrived data stream instances. The offline component consists of algorithms that combine base model predictions, determine the reliability of the ensemble predictions, select training data for new base models, create new base models, and determine whether the current online base models need to be replaced.
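The abstract does not say how base model predictions are combined or how reliability is measured, so the following is only a minimal sketch under stated assumptions: predictions are combined by majority vote, reliability is taken to be the fraction of base models that agree with the vote, and an instance is selected as training data only when that fraction reaches a threshold. The names `CategoricalNB`, `ensemble_predict`, `select_training_data`, and `min_agreement`, as well as the toy data, are hypothetical and not taken from the paper.

```python
import math
from collections import Counter, defaultdict


class CategoricalNB:
    """Naive Bayes over categorical features with Laplace smoothing."""

    def fit(self, X, y):
        self.class_counts = Counter(y)
        self.n = len(y)
        # value_counts[(feature index, class)][value] -> count
        self.value_counts = defaultdict(Counter)
        self.values = defaultdict(set)  # distinct values seen per feature
        for row, label in zip(X, y):
            for j, v in enumerate(row):
                self.value_counts[(j, label)][v] += 1
                self.values[j].add(v)
        return self

    def predict(self, row):
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / self.n)  # log prior
            for j, v in enumerate(row):
                num = self.value_counts[(j, label)][v] + 1
                den = count + len(self.values[j])
                score += math.log(num / den)  # smoothed log likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def ensemble_predict(models, row):
    """Majority vote plus the fraction of base models that agree with it."""
    votes = Counter(m.predict(row) for m in models)
    label, top = votes.most_common(1)[0]
    return label, top / len(models)


def select_training_data(models, stream, min_agreement=0.9):
    """Assumed rule: keep instances whose ensemble label has high agreement."""
    selected = []
    for row in stream:
        label, agreement = ensemble_predict(models, row)
        if agreement >= min_agreement:
            selected.append((row, label))
    return selected


# Toy chunks of (outlook, windy) -> label; the third chunk simulates drift.
stable_X = [("sunny", "no"), ("sunny", "no"), ("rainy", "yes"), ("rainy", "yes")]
stable_y = ["play", "play", "stay", "stay"]
drift_y = ["stay", "stay", "stay", "stay"]

models = [
    CategoricalNB().fit(stable_X, stable_y),
    CategoricalNB().fit(stable_X, stable_y),
    CategoricalNB().fit(stable_X, drift_y),
]

new_instances = [("sunny", "no"), ("rainy", "yes")]
for row in new_instances:
    print(row, ensemble_predict(models, row))
print(select_training_data(models, new_instances))
```

In this sketch, low agreement among base models serves two purposes at once: it keeps unreliable self-labeled instances out of the training set, and a sustained drop in agreement could be read as a signal that model revision is due. Whether the paper uses agreement in this way is not stated in the abstract.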
Similar resources
Comparison of Decision Tree and Naïve Bayes Methods in Classification of Researcher’s Cognitive Styles in Academic Environment
In today's world of the internet, it is important to give users feedback based on what they demand. Moreover, classification is one of the important tasks in data mining. Today, there are several techniques for solving classification problems, such as Genetic Algorithm, Decision Tree, Bayesian and others. This article attempts to classify researchers as “Expert” and “No...
ADABOOST ENSEMBLE ALGORITHMS FOR BREAST CANCER CLASSIFICATION
With advances in technology, different tumor features have been collected for Breast Cancer (BC) diagnosis. Processing such large data sets poses challenges, including high storage requirements and the time required for access and processing. The objective of this paper is to classify BC based on the extracted tumor features. To extract useful information and diagnose the tumo...
S3PSO: Students’ Performance Prediction Based on Particle Swarm Optimization
Nowadays, new methods are required to take advantage of the rich and extensive gold mine of data, given the vast amount of data created particularly by educational systems. Data mining algorithms have been used in educational systems, especially e-learning systems, due to the broad usage of these systems. Providing a model to predict final student results in an educational course is a reason for usi...
Using Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council
Supervised clustering is a data mining technique that assigns a set of data to predefined classes by analyzing dataset attributes. It is considered an important technique for information retrieval, management, and mining in information systems. Since customer satisfaction is the main goal of organizations in modern society, to meet the requirements, the 137 call center of Tehran City Council is ...